Poster: nathanshor Date: Dec 20, 2002 4:30pm
Forum: web Subject: bad archiving

Do you ever notice that the archive sometimes doesn't get all the pictures, Flash, or advanced JavaScript and DHTML on the archived websites, and that the links are often broken?

Poster: Ustice Date: Dec 31, 2002 6:28am
Forum: web Subject: Re: bad archiving

I noticed this while trying to find back issues of Pibgorn, a really cute serial comic strip on . I was REALLY hopeful that I would be able to find the JPGs there, though I guess it is too much to ask for pics like that to be archived.

If anyone DOES have the early ones, I would appreciate it if you could email me at jXereXmywinXder@XyaXhoo.coXm (remove the X's) so that I could fill out my own personal archive. Thanks.

Poster: brewster Date: Dec 31, 2002 12:00pm
Forum: web Subject: On images in the web collection

We apologize for the lack of images in some sections of the collection. The percentage of images collected depends on a couple of factors:
* Robot exclusions for /images are common, so we did not collect those images.
* Storage cost varied over time, and this changed what percentage of images we collected. For instance, 1997 and 1998 were quite good years for images, but 1999 was not, and the images from 2000 are still somewhat on tape.
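
For anyone curious how a robot exclusion ends up hiding images, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs, the `example-crawler` user agent, and the `is_allowed` helper are all made up for illustration; this is not the crawler's actual code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes the image directory for all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""

def is_allowed(robots_txt, url, user_agent="example-crawler"):
    """Return True if a polite crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# The page itself may be fetched, but anything under /images/ may not,
# so an archived copy of the page would show broken images.
print(is_allowed(ROBOTS_TXT, "http://example.com/index.html"))
print(is_allowed(ROBOTS_TXT, "http://example.com/images/logo.jpg"))
```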

We are now collecting images better, we hope. (More accurately, it is Alexa that is doing most of the collecting.)

To find images in the collection, I suggest you try using the more general query mechanism:*/*

There are also ways to get longer lists of results by looking at the arguments that come back.


Poster: DVguru Date: Feb 21, 2003 7:58pm
Forum: web Subject: Re: On images in the web collection

In defense of the archivers, there isn't a storage medium in existence that is big enough to store this much data! That's like asking for an offline Internet! It's a huge amount of data, put quite simply.

Maybe one day, eh? (Imagine 800 terabytes of space! What comes after tera? I forget, and I'm too lazy to go find out. It's all Greek to me.)

My personal website had 3 MB of images, not including MP3s of my songs (yes! mine! I made them myself...), and they would not benefit a single person who browses this archive. In fact, it's actually illegal, since you'd be re-presenting copyrighted material, but that would be a whole new thread!
But to put just the HTML files on would be, say, 20 KB and have nearly all the info of the full site, at (20/3072)*100 =

about 0.65% of the size of the whole site

Kinda makes sense when it's put like that, eh?
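
The arithmetic above can be checked with a short sketch. It assumes 1 MB = 1024 KB (so 3 MB = 3072 KB) and treats the 20 KB of HTML as a round guess, as in the post; the function name `html_share_percent` is just for illustration. It also computes the share against images plus HTML together, since strictly speaking that is the "whole site".

```python
def html_share_percent(html_kb, images_mb):
    """Return the HTML size as a percentage of (images only, images + HTML)."""
    images_kb = images_mb * 1024  # assumes 1 MB = 1024 KB
    vs_images = html_kb / images_kb * 100
    vs_whole_site = html_kb / (images_kb + html_kb) * 100
    return vs_images, vs_whole_site

vs_images, vs_whole_site = html_share_percent(html_kb=20, images_mb=3)
print(f"HTML vs. images only:  {vs_images:.2f}%")
print(f"HTML vs. whole site:   {vs_whole_site:.2f}%")
```

Either way the figure rounds to roughly 0.65%, so the point stands: the text is a tiny fraction of what a full copy with images would need.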